Random graph model with power-law distributed triangle subgraphs 
Clustering is well-known to play a prominent role in the description and understanding of complex networks, 
and a large spectrum of tools and ideas have been introduced to this end. In particular, it has been recognized 
that the abundance of small subgraphs is important. Here, we study the arrangement of triangles in a model 
for scale-free random graphs and determine the asymptotic behavior of the clustering coefficient, the average 
number of triangles, as well as the number of triangles attached to the vertex of maximum degree. We prove that 
triangles are power-law distributed among vertices and characterized by both vertex and edge coagulation when 
the degree exponent satisfies $2 < \beta < 2.5$; furthermore, a finite density of triangles appears as $\beta = 2 + 1/3$. 



PACS numbers: 89.75.Da, 89.75.Fb, 89.75.Hc 

Graph representation is extensively used in many branches 
of science in order to reduce the complexity of systems whose 
components have pairwise interactions and where distance 
is irrelevant. One associates the components of the system 
with the vertices of a graph and connects two of them by 
an edge whenever a given property holds. It has turned out 
that real-world networks, ranging from biology to physics, 
display common topological features. Importantly, their
degrees are power-law distributed (i.e., the number of vertices
with k edges goes as $k^{-\beta}$ for some $\beta > 2$, called the
degree exponent), which reflects the presence of self-organizing
phenomena underlying their architecture [1]. Owing to their
power-law degree distribution, such networks are usually
referred to as scale-free networks [2], i.e., networks with no
intrinsic characteristic degree.

A number of models aiming at understanding the features
of complex networks have been proposed; see, for instance,
[3, 4, 5, 6, 7], to cite a few. In this work we focus on a model
for power-law random graphs [8] giving good insight into the
clustering properties. We demonstrate that triangles coagulate
into clusters and, in contrast to classical models for random
graphs (see [9] for a review), they are power-law distributed:
the probability for a randomly selected vertex to participate in
t triangles goes as $t^{-(1+\beta)/2}$, with $\beta$ being the
degree exponent. This scaling relation suggests that triangles
might be regarded as a fundamental element for the
characterization of real-world networks.

Our motivation resides in the recent attention devoted to the 
occurrence of small subgraphs, or motifs, in scale-free networks.
It has been observed [10, 11] that some motifs are
over-represented in real-world networks as compared to
randomized networks with the same degree distribution. Usually
the triangle is the building block of most motifs and for
random regular graphs it has been remarked [12] that when
one imposes a finite density of triangles, they have the ten- 
dency (i.e., higher probability) to organize themselves into 
complete subgraphs. Surprisingly, this phenomenon is more 
likely when the imposed density of triangles is small. 

Our interest in triangles is also motivated by their interplay
with a simple transitivity relation and the fact that the
clustering coefficient can be used for breaking graphs up into
clusters carrying coherent information. The clustering
coefficient for a given vertex i with degree $k_i$ is defined as [3]

$$ C_i = \frac{2 t_i}{k_i^2 - k_i}, $$

$t_i$ being the number of triangles attached to vertex i.
Clusters are obtained by fixing a threshold value and removing all
vertices, and edges incident to them, with $C_i$ falling below it.
This scheme was applied to detect interest communities in the
World Wide Web [13], which turned out to be strongly affected
by the presence of co-links. This means that double edges with
opposite direction are part of a triangle with high probability,
in line with findings in [11], and thus emerge as the basic unit
of transitivity. A similar approach has also been employed
to organize lexical information into semantic classes in order
to differentiate meanings of ambiguous words [14]. Furthermore,
related lines of research [15, 16, 17, 18, 19, 20]
have stressed the importance and the abundance of cycles (or
loops) in scale-free networks.

The model. The best known model for random graphs is
the Erdős-Rényi model $\mathcal{G}(n,p)$ in which every graph
consists of n vertices and each pair is connected by an edge with
uniform, independent probability p. The topology of such
graphs, however, shows marked deviations from that observed
in real-world networks. For instance, if $p = \lambda/n$ the
degrees are Poisson distributed, that is, the probability for a
randomly selected vertex to have k edges is given by [9, 21]
$P(k) \simeq (\lambda^k/k!)\,e^{-\lambda}$, where $\lambda$ is the
average degree; furthermore, triangles are almost surely (i.e.,
with probability equal to one in the asymptotic limit) both edge
and vertex disjoint.

Here, we investigate a generalization of the Erdős-Rényi
model which exhibits a power-law degree distribution. In our
analysis we shall closely follow Ref. [8], to which we refer
the reader for more details.

Consider the set of random graphs $\mathcal{G}(\mathbf{w})$ in
which every graph is specified by the average degree sequence
$\mathbf{w} = (w_1, \ldots, w_n)$ arranged in decreasing order:
$w_1 \geq w_2 \geq \cdots \geq w_n$. Two vertices i and j are
connected with probability
$p_{ij} = w_i w_j / \sum_k w_k = \rho w_i w_j$, where
$1/\rho = \sum_{k=1}^{n} w_k$. Importantly, by setting

$$ w_i = c\,(i + i_0)^{-1/(\beta-1)} \qquad (1) $$

the number of vertices with degree k turns out to be proportional
to $k^{-\beta}$, and as a result the degrees are power-law
distributed with degree exponent $\beta$. The constants c and
$i_0$ appearing in Eq. (1) are determined by the average degree d
and the maximum degree m. For $\beta > 2$ one finds

$$ c = \frac{\beta-2}{\beta-1}\, d\, n^{1/(\beta-1)}
   \quad\text{and}\quad
   i_0 = n \left( \frac{d\,(\beta-2)}{m\,(\beta-1)} \right)^{\beta-1}. $$

Probability normalization requires that $p_{ij} \leq 1$ and so
$m \leq d^{1/2} n^{1/2}$. In this model the average degree is a free
parameter and in the following we will assume that $d > 1$; as
a consequence, the maximum degree scales with n as

$$ m \sim n^{\alpha} \quad\text{and}\quad 0 < \alpha \leq \tfrac{1}{2}. \qquad (2) $$
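
The construction above can be sketched numerically. The following is a minimal illustration, not the authors' simulation code: the function names are hypothetical, the vertex index is assumed to run over i = 0, ..., n-1 (so that the first vertex has expected degree exactly m), the constants c and $i_0$ are taken in the form $c = \frac{\beta-2}{\beta-1} d\, n^{1/(\beta-1)}$ and $i_0 = n\,[d(\beta-2)/(m(\beta-1))]^{\beta-1}$, $\rho$ is computed from the actual finite sequence, and self-loops are excluded.

```python
import numpy as np


def expected_degrees(n, beta, d, m):
    """Expected-degree sequence w_i = c * (i + i0)**(-1/(beta-1)).

    Assumption: the vertex index runs over i = 0, ..., n-1, so that
    the first vertex has expected degree exactly m.
    """
    c = (beta - 2.0) / (beta - 1.0) * d * n ** (1.0 / (beta - 1.0))
    i0 = n * (d * (beta - 2.0) / (m * (beta - 1.0))) ** (beta - 1.0)
    i = np.arange(n)
    return c * (i + i0) ** (-1.0 / (beta - 1.0))


def sample_graph(w, rng):
    """Draw one graph: edge (i, j) is present with probability rho*w_i*w_j."""
    rho = 1.0 / w.sum()                         # rho from the finite sequence
    p = np.minimum(rho * np.outer(w, w), 1.0)   # guard against p > 1
    draws = rng.random(p.shape) < p
    upper = np.triu(draws, k=1)                 # keep i < j, drop self-loops
    return upper | upper.T                      # symmetric boolean adjacency


rng = np.random.default_rng(0)
w = expected_degrees(n=2000, beta=2.5, d=3.0, m=np.sqrt(2000))
adj = sample_graph(w, rng)
print(w[0], adj.sum(axis=1).mean())
```

The dense n-by-n probability matrix keeps the sketch short but limits it to a few thousand vertices; larger graphs would need a sparse sampling scheme.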



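Note that $\alpha$ can be chosen independently of $\beta$. Another
quantity of interest is the second-order average degree
$\tilde{d} = \rho \sum_i w_i^2$, in terms of which we shall express
most of our results. In the asymptotic limit we have [8]:

$$ \tilde{d} \simeq \begin{cases}
  \dfrac{d^{\beta-2} (\beta-2)^{\beta-1}}{(\beta-1)^{\beta-2} (3-\beta)}\, m^{3-\beta} & \text{if } 2 < \beta < 3, \\[2ex]
  \tfrac{1}{2}\, d \log(2m/d) & \text{if } \beta = 3, \\[1ex]
  \dfrac{d\,(\beta-2)^2}{(\beta-1)(\beta-3)} & \text{if } \beta > 3,
\end{cases} \qquad (3) $$

making apparent the existence of three different regimes as a
function of the degree exponent.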
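
As a worked step, the $m^{3-\beta}$ growth of the second-order average degree in the regime $2 < \beta < 3$ can be recovered from Eq. (1) by replacing the sum with an integral. This is only a sketch under stated assumptions: $i_0 \ll n$, $\rho \approx 1/(nd)$, and $i_0 = (c/m)^{\beta-1}$, which follows from requiring the first vertex to have expected degree m.

```latex
\begin{align*}
\sum_i w_i^2
  &\approx c^2 \int_{i_0}^{\infty} x^{-2/(\beta-1)}\,dx
   = \frac{\beta-1}{3-\beta}\; c^2\, i_0^{-(3-\beta)/(\beta-1)}
   = \frac{\beta-1}{3-\beta}\; c^{\beta-1}\, m^{3-\beta}, \\
\tilde d = \rho \sum_i w_i^2
  &\approx \frac{1}{nd}\,\frac{\beta-1}{3-\beta}\;
     n\, d^{\beta-1} \Big(\frac{\beta-2}{\beta-1}\Big)^{\beta-1} m^{3-\beta}
   = \frac{d^{\beta-2}(\beta-2)^{\beta-1}}{(\beta-1)^{\beta-2}(3-\beta)}\; m^{3-\beta}.
\end{align*}
```

The integral converges at its upper limit only because $2/(\beta-1) > 1$ for $\beta < 3$; for $\beta \geq 3$ the upper limit dominates instead, which is the origin of the other two regimes.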

Results. The average number of triangles $t_i$ attached to
vertex i is $t_i = \sum_{j>k;\; j,k \neq i} p_{ij} p_{jk} p_{ki}$.
This sum may be rearranged as

$$ t_i = \frac{\rho^3 w_i^2}{2} \Big[ \sum_j \sum_k w_j^2 w_k^2
   - 2 w_i^2 \sum_j w_j^2 - \sum_j w_j^4 + 2 w_i^4 \Big]. $$

In all regimes the leading term arises from the first (double)
sum in the right-hand side of the above expression. We find
that $t_i/w_i^2 = \rho \tilde{d}^2/2$ is of order
$O(n^{-1} m^{2(3-\beta)})$ if $2 < \beta < 3$, of order
$O(n^{-1} (\log n)^2)$ if $\beta = 3$, and of order $O(n^{-1})$ if
$\beta > 3$. Neglected terms are subleading in each of the three
regimes. It readily follows that in the asymptotic limit the
average clustering coefficient of vertex i reads

$$ C_i = \frac{2 t_i}{w_i^2 - w_i} = \frac{\rho \tilde{d}^2}{1 - 1/w_i} \qquad (4) $$

and for sufficiently large values of $w_i$ this can be regarded as
independent of the degree of the anchor vertex. $C_i$ can be
interpreted as the probability that two neighbors of a vertex of
degree $w_i$ are joined together by an edge. By making use of
Eqs. (2) and (3) one finds how the clustering coefficient scales
with the number of vertices n in the asymptotic limit. The
average number of triangles attached to the vertex of maximum
degree is simply given by $t_1 \simeq \rho (\tilde{d} m)^2/2$. The
results in the asymptotic limit are summarized in Tab. I.
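
The expected triangle count per vertex can be checked numerically on a small example. The sketch below is illustrative only: the function names and the test weight sequence are hypothetical, and it assumes every $p_{ij} = \rho w_i w_j$ stays below 1 so that no clipping is needed. It compares the closed-form sum over pairs with a brute-force count based on the cube of the connection-probability matrix, and evaluates the leading-order estimate $\rho \tilde{d}^2 w_i^2/2$.

```python
import numpy as np


def expected_triangles(w):
    """Exact expected number of triangles at each vertex.

    Assumes every p_ij = rho*w_i*w_j is at most 1 (no clipping applied).
    """
    rho = 1.0 / w.sum()
    s2 = np.sum(w ** 2)
    s4 = np.sum(w ** 4)
    # sum of w_j^2 * w_k^2 over unordered pairs {j, k}, j != k, j, k != i
    pairs = 0.5 * ((s2 - w ** 2) ** 2 - (s4 - w ** 4))
    return rho ** 3 * w ** 2 * pairs


def leading_term(w):
    """Leading-order estimate rho * d2**2 * w_i**2 / 2,
    where d2 is the second-order average degree."""
    rho = 1.0 / w.sum()
    d2 = rho * np.sum(w ** 2)
    return 0.5 * rho * d2 ** 2 * w ** 2


# Hypothetical small example: an arbitrary decreasing weight sequence.
w = 5.0 * np.arange(1, 201) ** -0.4
P = np.outer(w, w) / w.sum()
np.fill_diagonal(P, 0.0)                 # no self-loops
brute = np.diagonal(P @ P @ P) / 2.0     # closed 3-walks through i, halved
t = expected_triangles(w)
print(np.allclose(t, brute), t[0], leading_term(w)[0])
```

The leading term always overestimates the exact value, since the terms it drops only remove pairs that coincide with each other or with the anchor vertex.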

The average number of triangles T is obtained by calculating

$$ T = \sum_{i>j>k} p_{ij} p_{jk} p_{ki}
   = \frac{\tilde{d}^3}{3!}
   - \frac{\rho^3}{2} \sum_i w_i^4 \sum_j w_j^2
   + \frac{\rho^3}{3} \sum_i w_i^6 . $$

As before, the dominant term arises from the first term in
the right-hand side of the above expression, $\tilde{d}^3/3!$, of
order $O(m^{3(3-\beta)})$ if $2 < \beta < 3$, of order
$O((\log n)^3)$ if $\beta = 3$, and of order $O(1)$ if $\beta > 3$.
The other ones are at most of order $O(m^{2(3-\beta)})$ if
$2 < \beta < 3$, at most of order $O(\log n)$ if $\beta = 3$,
and at most of order $O(m^{3-\beta})$ if $\beta > 3$. The asymptotic
behavior of T as a function of m and n for the different regimes
is also shown in Tab. I.

TABLE I: Asymptotic behavior of the clustering coefficient, C, the
average number of triangles, T, and the number of triangles attached
to the vertex of maximum degree, $t_1$, as a function of the degree
exponent $\beta$. Recall that $m \sim n^{\alpha}$ with $0 < \alpha \leq 1/2$.

          $2 < \beta < 3$                 $\beta = 3$                      $\beta > 3$
$C$       $\sim m^{2(3-\beta)} n^{-1}$    $\sim (\log m)^2 n^{-1}$         $\sim n^{-1}$
$T$       $\sim m^{3(3-\beta)}$           $\sim (\log m)^3$                $< \infty$
$t_1$     $\sim m^{2(4-\beta)} n^{-1}$    $\sim m^2 (\log m)^2 n^{-1}$     $\sim m^2 n^{-1}$
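
The total expected number of triangles can be cross-checked in the same spirit. The sketch below is an illustration under the assumption that all $p_{ij} \leq 1$; the function name and the test sequence are hypothetical. It counts ordered triples of distinct vertices by inclusion-exclusion and compares the result with the trace of the cubed connection-probability matrix.

```python
import numpy as np


def expected_total_triangles(w):
    """Expected number of triangles in the whole graph.

    Ordered triples of distinct vertices contribute
    s2**3 - 3*s2*s4 + 2*s6; each triangle corresponds to 3! of them.
    """
    rho = 1.0 / w.sum()
    s2, s4, s6 = (np.sum(w ** k) for k in (2, 4, 6))
    return rho ** 3 * (s2 ** 3 - 3.0 * s2 * s4 + 2.0 * s6) / 6.0


# Hypothetical small example: an arbitrary decreasing weight sequence.
w = 5.0 * np.arange(1, 201) ** -0.4
rho = 1.0 / w.sum()
d2 = rho * np.sum(w ** 2)                # second-order average degree

P = rho * np.outer(w, w)
np.fill_diagonal(P, 0.0)                 # no self-loops
brute = np.trace(P @ P @ P) / 6.0        # closed 3-walks, 6 per triangle

T = expected_total_triangles(w)
print(np.isclose(T, brute), T, d2 ** 3 / 6.0)
```

The last printed value is the dominant term $\tilde{d}^3/3!$, which bounds T from above because the two correction terms combine to a non-positive contribution.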

We next address the question of how triangles are distributed
over the graph. Starting from the results above, a simple
calculation shows that the probability for a randomly selected
vertex to participate in t triangles goes as

$$ P(t) \sim t^{-(1+\beta)/2} \qquad (5) $$

and thus triangles are power-law distributed among vertices.

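The exponent $(1+\beta)/2$ follows from a change of variables. The sketch below assumes only the leading-order relation $t \simeq \rho \tilde{d}^2 w^2/2$ between the triangle count and the expected degree w of a vertex, together with the power-law degree distribution $P(w) \propto w^{-\beta}$.

```latex
\begin{align*}
t(w) &\simeq \tfrac{1}{2}\,\rho\,\tilde d^{\,2}\, w^2
  \;\Longrightarrow\;
  w(t) \propto t^{1/2}, \qquad \frac{dw}{dt} \propto t^{-1/2}, \\
P(t) &= P\big(w(t)\big)\,\frac{dw}{dt}
  \;\propto\; t^{-\beta/2}\; t^{-1/2}
  \;=\; t^{-(1+\beta)/2}.
\end{align*}
```

For instance, $\beta = 2.5$ gives an exponent of 1.75.
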
Discussion. Some remarks on Tab. I are in order. We see
that irrespective of the choice of $\alpha$, Eq. (2), the clustering
coefficient remains a decreasing function of n for $\beta > 2$ and
is always smaller than 1, and thus it preserves its probabilistic
interpretation. The number of triangles always diverges with n
in the range $2 < \beta < 3$, corresponding to the regime observed
in real-world networks (see [1] for examples); if instead
$\beta > 3$ then there is a finite number of triangles, as in the
Erdős-Rényi model. From Tab. I we can also see that
$\alpha = 1/2$ seems to be a natural choice, and hence we set
$\alpha$ equal to this value from here on.

Eq. (5) is our main result. This scaling relation tells us that
with non-negligible probability some vertices participate in
a large number of triangles, which implies that triangles are not
scattered over the whole graph, as in the Erdős-Rényi model,
but coagulate around some vertices. Further understanding of
such a phenomenon can be gained by studying the inequality
$t_i > w_i/2$, leading to
$i < O(1) \times n\,(\tilde{d}^2/n)^{\beta-1} - i_0$. Triangles
start sharing a common edge when
$(n/i_0) \times (\tilde{d}^2/n)^{\beta-1}$ is at least of order
$O(1)$, that is, for $2 < \beta < 2.5$. Furthermore, the
number of vertices at which edge coagulation occurs goes as
$n^{-(\beta-\beta_-)(\beta-\beta_+)} + O(n^{(3-\beta)/2})$ with
$\beta_\pm = (3 \pm \sqrt{5})/2$.

Note that as $\beta$ approaches 2 the vertex of maximum degree
sees around itself a tightly connected cloud since the clustering
coefficient is close to being constant, whereas for $\beta > 3$
triangles are sparse in its neighborhood. In contrast, by looking
at the fraction of triangles attached to it, that is

$$ \frac{t_1}{T} \sim \begin{cases}
  n^{(\beta-3)/2} & \text{if } 2 < \beta < 3, \\
  (\log n)^{-1} & \text{if } \beta = 3, \\
  < \infty & \text{if } \beta > 3,
\end{cases} $$

we deduce that triangles are spread over the graph for
$2 < \beta \leq 3$, and essentially centered around the vertex of
maximum degree otherwise. Another quantity of interest is the
density of triangles, namely $T/n \sim n^{3(2-\beta)/2 + 1/2}$ for
$2 < \beta < 3$, and for $\beta = 2 + 1/3$ we have a finite density
of triangles.

[Figure 1; horizontal axis: t, the number of triangles.]

FIG. 1: The number of vertices participating in a given number of
triangles as obtained from simulations for $\beta = 2.2$. The number
of vertices in graphs is set to $n = 2 \times 10^4$, the maximum
degree to $m = \sqrt{n}$, and the average degree to
$d = (\beta - 1)/(\beta - 2)$. Averages are taken over 200
realizations and the scale of axes is logarithmic. The linear fit
yields $\delta_m = 1.71 \pm 0.02$. For other values of $\beta$ the
results are summarized in Tab. II. Inset: The degree distribution
$P(k)$. The solid line has slope $-2.29 \pm 0.02$.



Simulations have been performed in order to study the scaling
relation of Eq. (5) as a function of $\beta$. Figure 1 illustrates
the results for $\beta = 2.2$. Points obtained from simulations
clearly follow a power law with a cut-off as t approaches
$t_1 \approx 56$; the measured exponent $\delta_m$ is in accordance
with the theoretical value. Table II shows the results for other
values of $\beta$. Finite size effects are more marked as $\beta$
approaches 3. The reason is that the number of triangles and, in
particular, $t_1$, which determines the cut-off, increase with n at
a slower rate (see Tab. I). In that respect it is worth noticing
that for $\beta = 2.2$ and $n = 2 \times 10^4$ vertices we have
$m \approx 141$ and $t_1 \approx 56$, and edge coagulation does not
occur since $t_1 > m/2$ does not hold. This is a finite size effect
since for $n = 10^7$ vertices we would have $m \approx 3{,}162$ and
$t_1 \approx 4{,}033$, and the condition for edge coagulation is
fulfilled. To make this point clearer we have investigated
numerically $t_1$ as a function of n; the results are shown in
Fig. 2 and we see a good agreement between simulations and
theoretical predictions in the different regimes. Obviously, the
power-law behavior breaks down in the presence of a small, finite
number of triangles on average, i.e., $\beta > 3$.
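
The finite-size numbers quoted above can be checked with a short script. This is a sketch under stated assumptions rather than the authors' procedure: the function name is hypothetical, the vertex index is taken to run over i = 0, ..., n-1, $\rho$ and $\tilde{d}$ are computed from the exact finite sums instead of their asymptotic forms, and $t_1$ is evaluated from the leading-order expression $\rho (\tilde{d} m)^2/2$ with $m = \sqrt{n}$ and $d = (\beta-1)/(\beta-2)$.

```python
import numpy as np


def t1_finite_size(n, beta):
    """Leading-order t_1 = rho*(d2*m)**2/2 from exact finite-n sums.

    Assumptions: m = sqrt(n), d = (beta-1)/(beta-2), vertex index
    i = 0, ..., n-1, and rho = 1/sum(w) from the actual sequence.
    """
    m = np.sqrt(n)
    d = (beta - 1.0) / (beta - 2.0)
    c = (beta - 2.0) / (beta - 1.0) * d * n ** (1.0 / (beta - 1.0))
    i0 = n * (d * (beta - 2.0) / (m * (beta - 1.0))) ** (beta - 1.0)
    w = c * (np.arange(n) + i0) ** (-1.0 / (beta - 1.0))
    rho = 1.0 / w.sum()
    d2 = rho * np.sum(w ** 2)            # second-order average degree
    return m, 0.5 * rho * (d2 * m) ** 2


for n in (2 * 10**4, 10**7):
    m, t1 = t1_finite_size(n, beta=2.2)
    # last column: is the edge-coagulation condition t_1 > m/2 met?
    print(n, round(m), round(t1), t1 > m / 2)
```

Under these assumptions the script gives values close to those quoted in the text, with the edge-coagulation condition failing at the smaller size and holding at the larger one. Using the asymptotic form of $\tilde{d}$ instead of the finite sums would give noticeably smaller values at these sizes, because the normalization converges slowly when $\beta$ is close to 2.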

TABLE II: The exponent characterizing the distribution of triangles
among vertices, Eq. (5), resulting from simulations as a function of
the degree exponent $\beta$. Here $\delta_m$ and $\delta_t$ denote
the measured and theoretical values, respectively.

[Table body largely lost in extraction; surviving entries:
1.83 ± 0.04; 1.65; 1.75.]

FIG. 2: The dependence of $t_1$, the average number of triangles
attached to the vertex of maximum degree, on n, the graph size, as
obtained from simulations for $\beta = 2.2$ (circles) and 2.5
(squares). Comparison is made with the theoretical prediction
$t_1 = \rho (\tilde{d} m)^2/2$ (solid lines). As before
$d = (\beta - 1)/(\beta - 2)$ and $m = \sqrt{n}$; averages
are taken over 200 configurations and the scale of axes is
logarithmic. The linear fit yields a slope of $0.71 \pm 0.01$ for
$\beta = 2.2$ and of $0.48 \pm 0.01$ for $\beta = 2.5$; the
theoretical value is given by $3 - \beta$ (cf. Tab. I). The dashed
line corresponds to $t_1 = \sqrt{n}/2$, marking the transition to
edge coagulation. Notice that for $\beta = 2.5$ there is no edge
coagulation, whereas for $\beta = 2.2$ there is; indeed, points
obtained from simulations cross the dashed line. Inset: $t_1$ as a
function of n for $\beta = 3$ (triangles) and 3.4 (diamonds). Solid
lines correspond to the theoretical predictions.



We point out that the coagulation phenomenon reported
in [12] and the one investigated here are of quite a different
character. Specifically, in regular graph models the number
of graphs with a finite density of triangles is small, and these
correspond statistically to graphs obtained by placing triangles
so as to construct the largest complete subgraph. Conversely, in
power-law random graphs it turns out that triangles are
significant on average and display statistical regularities. The
common feature is that as topology departs from a certain degree
of randomness it gives rise to a pressure towards clustering
and triangles arrange themselves accordingly.

It is possible to make contact with models making use of
fitness variables. In Refs. [7, 25] two vertices i and j are
connected with probability $f(x_i, x_j)$, where $x_i$ and $x_j$
denote the intrinsic fitness of i and j, respectively. Fitness of
vertices is distributed according to $h(x)$. Within this model the
number of triangles attached to a vertex of fitness x is

$$ t(x) = \frac{n^2}{2} \int_0^{\infty}\!\!\int_0^{\infty}
   f(x,y)\, f(y,z)\, f(z,x)\, h(y)\, h(z)\, dy\, dz
   = \frac{n^2}{2}\, g(x). $$

It follows that the probability for a randomly selected vertex
to participate in t triangles can be written in terms of h and g
through the change of variables $x \to t(x)$.

The statistical properties of graphs arise from the choice of f
and h and one can prove that for a particular choice this model
is equivalent to the one studied here. We leave a detailed
discussion to a future publication.

A generalization of the model investigated here would
consist in implementing a non-trivial dependence of the clustering
coefficient on the degree. Note, however, that the mechanisms
responsible for clustering are basically the same, and in the
case of a clustering coefficient decreasing with the degree k
as $C \sim k^{-\gamma}$ we have
$P(t) \sim t^{-(1+\beta-\gamma)/(2-\gamma)}$. We refer the
reader to Ref. [26] for a study of the presence of this scaling
relation in biological networks. The purpose of [26] was to
establish a duality between large-scale topological organization
and local subgraph structure in empirical networks. Our
analysis differs from [26] in that we have dealt with a
probabilistic model allowing for a rigorous treatment of the
asymptotic limit, but this is done at the expense of generality. Note
that random growth processes have been investigated within 
the framework of the same ideas in ioj^ . 

To summarize, in this work we have presented the study of
a random graph model and derived the asymptotic behavior of
some quantities describing the clustering properties, coming
to the conclusion that they are characterized by three regimes
(Tab. I). The picture that emerges is that as the degree exponent
$\beta$ decreases the number of triangles increases and they
arrange themselves within graphs so as to create tightly connected
cores around vertices of progressively smaller degree, resulting in
a power-law distribution, Eq. (5). This is what we refer to as
coagulation of triangles. In itself, this phenomenon dictates
the abundance of recurring small patterns in the graph.